Attorney Docket No.: 2839/115

#### IN THE UNITED STATES PATENT AND TRADEMARK OFFICE

Applicants: Airey et al.

Examiner: Wang, Jin-Cheng

Serial No.: 09/614,363

Art Unit: 2628

Date Filed: July 12, 2000

Conf. No.: 2211

Title: DISPLAY SYSTEM HAVING FLOATING POINT RASTERIZATION AND FLOATING POINT FRAMEBUFFERING

Commissioner for Patents

P.O. Box 1450 Alexandria, VA 22313-1450

#### DECLARATION OF JOHN M. AIREY UNDER 37 CFR §1.132

Dear Sir:

In support of the accompanying response to the February 20, 2008 office action in the above-referenced application, I hereby declare as follows:

- 1. I am the first-named joint inventor of the subject US patent application (hereinafter "the Application"), which is assigned serial number 09/614,363 and was filed on July 12, 2000.
- 2. I am submitting this declaration to show that some details of my invention claimed in the Application are described in US patent number 6,567,083 (hereinafter "the Patent"). In other words, using the language of MPEP 715.01(c), the Patent is a publication of my own invention. To that end, this declaration includes facts showing that:
  - I jointly made the invention upon which the relevant disclosure of the Patent is based,
  - I was associated with Daniel Baum, the first named inventor of the Patent, and
  - Mr. Baum derived his knowledge of the relevant subject matter from me.

#### I jointly made the invention upon which the relevant disclosure of the Patent is based

- 3. In 1996, I was employed at Silicon Graphics, Inc. ("SGI") in Mountain View, CA as a staff engineer. In the summer of 1996, I started work on building a machine that used floating point rasterization, including floating point scan conversion, and a floating point framebuffer. The internal code name for the machine at SGI was "Bali." The Application describes and claims various aspects of Bali.
- 4. The general architecture of Bali is described in the document entitled "Bali Offsite," which is dated November 12, 1996 and attached as Exhibit A. This document was discussed among SGI personnel at an off-site meeting on November 12, 1996. Among other things, this document shows:
  - A reference to the "R Chip" on the Agenda page (immediately following the cover page), the Contents page, and page 22. The R Chip refers to the portion of the Bali device performing rasterization,
  - A chart of software simulating Bali on page 21,
  - A block diagram of the rasterizer (i.e., the R Chip) and scan converter on page 24,
  - A block diagram of a floating-point block on page 29, and
  - Floating-point multipliers and floating-point normalizers/adders on page 35.
- 5. A number of other documents show specific details and aspects of my invention. These documents were prepared either by one of the joint inventors of the Application or by someone with information that originated from one of the joint inventors. Some of those documents are discussed in paragraphs 6 to 13.

- 6. "Bali Floating Point Representation" A document dated December 3, 1996 discussing a 16-bit floating point format and internal variations used inside the R chip pipeline, and submitted herewith as Exhibit B.
- 7. "Mapping RenderMan to OpenGL" A document I authored, dated January 24, 1997, that includes software code to be used with Bali hardware to have floating point processing through the R Chip pipeline, and submitted herewith as Exhibit C.
- 8. Document containing Code headed "head 1.1" A document I authored, dated January 24, 1997, that describes rasterization of pixels, color processing using floating point numbers and framebuffers, and submitted herewith as Exhibit D.
- 9. "Extended Range and/or Precision" A document dated February 15, 1997 and authored by my co-inventor, Mark Peercy, and me. This document describes calculations using floating point numbers and which format of floating point numbering to use, and submitted herewith as Exhibit E.
- 10. "Tiny Floating Point" A document dated February 15, 1997 and authored by Mr. Peercy and me. This document describes disadvantages of fixed point arithmetic and further describes a tiny IEEE like floating point format, and submitted herewith as Exhibit F.
- 11. Document containing Code headed "head 1.5" A document with versions dated from July 15, 1997 to July 28, 1997 and authored by John Paul Alex, who was an intern working for Mr. Peercy and me. This document describes calculations using floating point numbers and which format of floating point number to use, and submitted herewith as Exhibit G.

- 12. Email from me to Mr. Baum dated August 14, 1997. In this e-mail, I mention the s10e5 pipeline and high-speed data transfer from the host to the frame buffer, the frame buffer to the texture memory, and texture memory to the frame buffer. This document is attached as Exhibit H.
- 13. "Invention Disclosure" A document dated September 22, 1997 that is an invention disclosure submitted to SGI Legal Services, and submitted herewith as Exhibit I. The document describes a 16 bit floating point format for texture store and framebuffer. The document further states that the invention was conceived about a year prior to the submission of the Invention Disclosure. This document also notes that Mr. Baum is the department manager for visual systems. Mr. Baum is not listed as an inventor.
- 14. Mark Leather, who was a member of the SGI technical staff from 1989-1997 but not an inventor on the Application, further corroborates my invention of the subject matter.
- 15. During a deposition conducted on December 7, 2007 (see Exhibit J), when asked if it was his "understanding that [the idea of multipassing data through the frame buffer] involved the use of floating point formatted data," Mr. Leather responded: "Yes, that was my understanding."
- 16. Further, in response to the question "And did you understand that that was [Mark Peercy's] concept that he was working on at SGI?," Mr. Leather responded: "I believe it was him and John Airey."

- 17. During the deposition, in response to the question "But do you recall the use of a floating point frame buffer as it related to Bali?" Mr. Leather responded: "Yes." Further, in response to the question "And that was Dr. Peercy and Dr. Airey's work?" Mr. Leather responded: "Yeah."
- 18. During the deposition, Mr. Leather recollected that a second project at SGI had a floating point frame buffer, but when asked if the "other work was an extension of Airey and Peercy's work," Mr. Leather replied: "Yeah."

#### I was associated with Daniel Baum at the time of my joint invention

19. Mr. Baum was my supervisor as the Hardware Director for the Bali project. We thus both worked together at SGI at the same time—among other times, while I was developing the invention claimed in the Application.

#### Daniel Baum derived his knowledge of the relevant subject matter from me

- 20. In his capacity as hardware director of the Bali project, Mr. Baum was responsible for understanding the technology and, as such, he was a decision-maker on the direction, schedule, and features of the project. In addition, in this capacity, Mr. Baum was responsible for updating company executives. To fulfill these and other responsibilities, Mr. Baum needed to be familiar with the software simulation, the floating-point scan converter, the floating-point frame buffer, and their technical advantages/benefits.
- 21. My co-inventors and I therefore described various aspects of the invention and the simulation to Mr. Baum so he could make the appropriate decisions. For example, Mr. Baum was present at the above noted *Bali* offsite meeting of November 12, 1996 (see Exhibit A), as well as other internal *Bali* meetings. In addition, in his capacity as Hardware Director of *Bali*, Mr. Baum had access to many of the documents discussed herein.

- 22. As noted above, on August 14, 1997, Mr. Baum sent me an e-mail asking about opportunities to differentiate SGI. I responded, as noted above, by discussing the s10e5 pipeline and floating-point frame buffer (see Exhibit H). I thus conveyed details of my invention to Mr. Baum in this e-mail.
- 23. Danny Loh, one of the joint inventors of the Application, worked on the software simulation that appears to be mentioned in the Patent. Specifically, during a deposition conducted on November 9, 2007, Mr. Loh stated "So the question you asked me before, did I work with John Airey on the floating point on Bali?.... Yes." Mr. Loh further stated that he "wrote sort of the simulation framework to evaluate floating point formats in the frame buffer." When asked if Mr. Loh was doing that work with John Airey, Mr. Loh replied "with John Airey and Mark Peercy." That portion of Mr. Loh's transcript is in the prior noted Exhibit J.
- 24. In his capacity as Hardware Director, Mr. Baum should have known about the software simulation.
- 25. It is my understanding that before the filing date of the Patent, SGI did not have another floating-point project that was independently derived. Instead, as Mr. Leather stated under oath, if any such projects existed, they were derived from the principles of my joint invention.


26. I hereby declare that all statements made herein of my own knowledge are true, that all statements made herein on information and belief are believed to be true, and further that these statements were made with the knowledge that willful false statements and the like are punishable by fine or imprisonment, or both under Section 1001 of Title 18 of the United States Code and that such willful false statements may jeopardize the validity of the application or any patent issuing thereon.

John M. Airey

Dated: May 15, 2008


## Bali Offsite 11/12/96


Silicon Graphics Inc. Company Confidential

### Agenda

| Time  | Item                      |
|-------|---------------------------|
| 9:00  | BREAKFAST                 |
| 9:15  | Introduction              |
| 9:16  | G Chip                    |
| 10:45 | BREAK                     |
| 10:55 | N Chip                    |
| 11:15 | Documentation Environment |
| 11:45 | Design Flow               |
| 12:00 | LUNCH                     |
| 12:45 | R Chip                    |
| 2:15  | Software                  |
| 2:30  | M Chip                    |
| 2:50  | BREAK - Driving Range     |
| 3:30  | D Chip                    |
| 4:20  | System                    |
| 4:50  | Wrap-up                   |
| 5:15  | ADJOURN                   |


### Contents

| Section                   | Page |
|---------------------------|------|
| G Chip                    | 1    |
| GE Ucode                  | 3    |
| GE vector and scalar      | 6    |
| PE                        | 10   |
| GRU                       | 13   |
| N Chip                    | 14   |
| Documentation Environment | 17   |
| Simulation Environment    | 21   |
| R Chip                    | 22   |
| Scan Conversion           | 24   |
| Texture                   | 27   |
| Texture Filter            | 29   |
| Lighting                  | 30   |
| FRU                       | 38   |
| SRAM                      | 44   |
| Pixel Response            | 48   |
| Pixel SW                  | 52   |
| Geometry Pipeline         | 56   |
| D Chip                    | 58   |
| System Configs            | 62   |
| M Chip                    | 65   |










Big changes:

- Vector unit [preceding word illegible in source]
- Commodity scalar unit

Special needs: [text illegible in source]





#### Vector Unit Programming Model

- Four vector sub units
- Round-robin thread distribution
- Host commands or scalar unit thread drives
- Vector thread forks a single or a chain of vector code fragments
- Vector generates output to either the PE or the scalar unit

SGI Confidential

Vector Unit Programming Model (cont'd)

[Text of the continuation slides is illegible in the source.]

4


HIGHLY CONFIDENTIAL


Geometry Performance

- Host to perform tasks beyond the scope of transport layer
- Immediate (PIO) vs list (DMA)
- Use mesh size of 4 for performance evaluation
- Vertex array as primary transport model
- Raise the bar for canonical benchmarks: defaults to normal normalization and texture xform

Geometric Statistics

- Ran across seven representative applications/demos
- Most typical mesh size used: 4
- Xform state changes are at about once per mesh in a non [illegible] application
- Material calls reached a couple of hundred in a frame [remainder illegible]
- No use of texgen

Geometry Performance (cont'd)

[Geometry performance table: vertex-rate estimates (DMA and CPU) for vertex, eval mesh, and eval coord cases under different lighting and texture settings. Values are illegible in the source.]







Scalar Unit Options

- Texas Instruments has a 200 MHz DSP core that looks like it fits the core requirements. [Remainder illegible; the option depends on the choice of ASIC vendor.]
- Option 3: Off the shelf micro-processor external to the G chip. Processors with 200 MHz operation and floating point are available. [Remainder illegible.]
- Core provided by a third party design house, an internal MIPS part, or a home-grown design. [Remainder illegible.]


rog 11/9/96



### PE Design Notes

Rob (Skip) El-Kareh

Gloria (Globot) Lau

Erik (Bowzer) Lindholm

Paul (Ironman) Thilking

Mark (Flyboy) Young



PE / Top Level Block Diagram

10

November 11, 1996 4:05 pm


Bali Offsite Notes



Geometry Block / Internal Block Diagram

11

Drebin

#### PE Functional Goals

- GE interface to Omega Network
- Includes interface to Convert/Merge
- Pixel conversion, reformatting, and distribution
  - Input (IEEE Float, U16, U8), start aligned, no garbage
  - Output (S10E5, U16, U8), first pixel word aligned start of each packet
- Bitmap unpacking, reformatting, and distribution, non-opaque and opaque, pixel packet per bit
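
The S10E5 output format listed above is a 16-bit float with one sign bit, ten mantissa bits, and five exponent bits. As a minimal sketch, assuming S10E5 shares the bit layout and exponent bias of IEEE 754 binary16 (the exhibits do not state the bias, rounding, or special-value rules, so those are assumptions here), Python's `struct` module can illustrate an IEEE float to 16-bit conversion. The function names are hypothetical.

```python
import struct


def float_to_half_bits(value: float) -> int:
    """Pack a float into a 16-bit pattern using IEEE 754 binary16.

    Assumption: S10E5 is treated as binary16 (1 sign, 5 exponent, 10 mantissa bits).
    """
    return struct.unpack("<H", struct.pack("<e", value))[0]


def half_bits_to_float(bits: int) -> float:
    """Expand a 16-bit binary16 pattern back into a Python float."""
    return struct.unpack("<e", struct.pack("<H", bits))[0]


def split_fields(bits: int) -> tuple:
    """Return the (sign, exponent, mantissa) fields of a 16-bit pattern."""
    sign = (bits >> 15) & 0x1
    exponent = (bits >> 10) & 0x1F
    mantissa = bits & 0x3FF
    return sign, exponent, mantissa


if __name__ == "__main__":
    for v in (1.0, -2.5, 0.1):
        bits = float_to_half_bits(v)
        print(v, hex(bits), split_fields(bits), half_bits_to_float(bits))
```

Values such as 0.1 do not survive the round trip exactly, which is the precision cost a converter of this kind accepts in exchange for the extended range.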


#### Rasterizer State Management

- Write / WriteThrough
- Dirty Bits per State Set
- R Chip Synchronization
- Broadcasts SyncID to R chips when Geometry is received before SyncFlag is cleared

- Used for texture download

Geometry primitive processing and distribution

- Standard OpenGL primitives (with swap)
- PolygonMode, line and point generation
- PolygonOffset, dz/dx & dz/dy, with scale and bias (may incur slight penalty)
- LineStipple counter, major and true lengths, with first vertex of each segment
- Clip exception processing (sent to GE, WILL incur significant penalty)

- Backface cull before clip
- Zmin/Zmax/Hitflag for select
- Bounding box coverage generation per primitive (including lines)
- Feedback on Host or in GE

### Assumptions


- R Handles: stippled line rasterization, wide stippled lines, wide points, AA lines and points
- Bounding Box rasterizer selection is good enough for lines

#### Open Issues


- Who handles rasterization of Constant Multisample Points (Lightpoints)
- What is the true benefit of R state shadowing? What is the state size? Can it be grouped into sets? Could it be better managed in N Chip?

- Material changes for per-vertex lighting ...?








13









Notice imbalance [remainder illegible].

Could use 4 chips, but no fair division.

#### N-Chip Issues

- Bandwidth and Latency: We can get more bandwidth with deeper FIFOs. We can get better latency with shallower FIFOs.
- Packet Ordering: N-Chips can guarantee order within a channel, but there is no coordination between channels.
- System-level Deadlock: Example: network clogged by texture requests so it cannot send texture responses. One solution: virtual channels...one class can go when the others are blocked.
- Flow Control: Throttle wire or credit scheme. Will need separate one for each virtual channel.
- Error Handling: At least want error detection. Correction: ECC or retry protocol.



#### Packet Ordering

Only one path from Node A to Node B within Omega network, so packets on any one channel arrive in order.

But there is no ordering control (and wide latency variation) between 16-bit channels.

Most transfers are self-descriptive and do not care about ordering:

- Texture requests
- Texture responses
- Video responses
- R + G flow control

Others will have to cope using sequence numbers.

Because of latency variations, better to resequence after buffering...no waiting!
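
The sequence-number approach described above can be sketched as a small resequencing buffer: every arriving packet is buffered immediately, and payloads are released as soon as the next expected number is present. This is an illustrative sketch only; the class name, the starting sequence number, and the unbounded buffer are assumptions, not details from the slides.

```python
class Resequencer:
    """Buffers out-of-order packets and releases them in sequence order."""

    def __init__(self, first_seq: int = 0):
        self.next_seq = first_seq  # next sequence number to deliver
        self.pending = {}          # seq -> payload, for packets that arrived early

    def receive(self, seq: int, payload):
        """Accept one packet; return the payloads that are now deliverable in order."""
        self.pending[seq] = payload
        ready = []
        while self.next_seq in self.pending:
            ready.append(self.pending.pop(self.next_seq))
            self.next_seq += 1
        return ready


if __name__ == "__main__":
    rs = Resequencer()
    for seq, payload in [(1, "b"), (0, "a"), (2, "c")]:
        print(seq, rs.receive(seq, payload))
```

Because the receiver never refuses an early packet, no channel has to stall while a slower channel catches up, which matches the "no waiting" remark.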

#### System-Level Deadlock

As network clogs, we need a way to relieve backpressure.

Must give precedence to "downstream" messages (e.g., responses instead of requests).

Virtual channels (sharing the same wires) is traditional way to allow high-precedence message to flow despite snarl of low-precedence message.

Virtual channels imply:

- Separate buffering
- Separate flow control
- Intermingling between virtual channels (each transfer identifies its virtual channel)

We need to identify all possible deadlocks!

We need to categorize packets into virtual channels to assign precedence.

Texture request Triangles
R + G Throttle Vertexes
Pixel read data Read rec


#### Flow Control

Sometimes node cannot accept packets. Example: R-chip can be overwhelmed by texture requests or pixel reads or triangles.

Need to be able to refuse packets.

Triangles handled by higher-level mechanism: Throttle packets from R+G.

Others handled by flow control wires, one per VC.

Two schemes:

- Throttle: shut up ASAP.
- Credit: keep track of receive buffers.

Credit is more effective...hides the latency...but it presumes one set of receive buffers. If we have multiple FIFOs behind each input, credit scheme

Throttle scheme always works, but must set high-water-mark lower to allow for skid due to latency.
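
A minimal sketch of the credit scheme, assuming a single set of receive buffers per link as the slide presumes. The sender holds one credit per free receive buffer, spends a credit on each packet, and regains it when the receiver drains a buffer. The class and method names are hypothetical, and credit-return latency is not modeled.

```python
from collections import deque


class CreditLink:
    """Sender-side credit counter paired with a receiver FIFO."""

    def __init__(self, receive_buffers: int):
        self.credits = receive_buffers  # free receive buffers known to the sender
        self.rx_fifo = deque()

    def try_send(self, packet) -> bool:
        """Send only if a credit is available, so the receiver cannot overflow."""
        if self.credits == 0:
            return False
        self.credits -= 1
        self.rx_fifo.append(packet)
        return True

    def consume(self):
        """Receiver drains one packet and returns a credit to the sender."""
        packet = self.rx_fifo.popleft()
        self.credits += 1
        return packet


if __name__ == "__main__":
    link = CreditLink(receive_buffers=2)
    print([link.try_send(p) for p in ("a", "b", "c")])
    print(link.consume(), link.try_send("c"))
```

Unlike a throttle wire, the sender here never needs headroom for packets already in flight, because it cannot send more than the receiver has room for.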

#### Error Handling

Source-synchronous signalling can have errors unless it is tuned perfectly.

During bringup, "perfect tuning" can take months to achieve. Error correction can allow this effort to proceed in parallel with debug, not in series.

At least two possible schemes:

- Error-correcting codes
- Retry

Before we choose, we need to understand what kind of errors we expect.

ECC requires less hardware, but fixes fewer errors. It is well-suited to single bit errors, not bursts.

Retry requires a lot of hardware...big send buffers. More robust in case of many errors.

In both cases, handling errors at a finer grain requires less RAM...correct or retry micropackets...

#### N Chip Estimate


**Bali Documentation Methodology**

Mark Leather

How was Kona documented?

- Web site created using third party frame-to-html utility
- Formatting guidelines were only loosely adhered to

What did we learn?

- Kona register spec language



A node linking facility will prevent duplication of work wherever possible, e.g. the description of "Z Buffering" only needs to be entered once

Database used as input to a new tool which will create a complete web site








18

Bali documentation methodology







How do we allow the spec language to evolve?

- The language should be general enough to allow new node types and node attributes to be added or removed at any time during the project cycle
- It is very important not to break the document tree when new features are added
- Node attributes are assigned a priority - "optional", "recommended", "required" or [remainder illegible]
- New required attributes are initially prioritized as [value illegible]; later, either upgrade the new attribute to 'required' or remove the attribute [conditions illegible]
- Attributes to be deleted are re-prioritized [value illegible]
- A web site will give a dated list of all spec language changes




Trial layouts, etc.

### Bali Simulation



Home Page: http://b7.asd/bali/trees/doc/cbips/r/index.html (M Chip) Patrick Law

- Need to simulate new fog, lighting
- On-the-fly texture compression
- Real "N" network simulations

External Parts

- 143 MHz 512Kx32 SGRAMs; also 1Mx16 and 4Mx16 SDRAMs
- 428 MHz clock generator
- Area-optimized fast SRAMs
- N chips :-)

Requirements


Performance

- 64 bits per clock texture download
- 1 pixel per clock polygon fill rate (2 tex, alpha-blend, z buf)
- 1/2 pixel per clock accumulation
- 1/6 pixel per clock 7x7 convolve
- 1 pixel per clock drawpixels
- 4 pixel per clock clear
- 143 MHz clock (SDRAM vendor consensus)

New Functionality

- Extended range & precision color components (s10e5 float)
- Fog as a function of Range & Elevation
- Per-pixel lighting
- 2 Active Textures
- Space-varying convolve

# R Block Diagram






24

[Handwritten annotation, partly illegible: "250K ... 350K ... May need to ... to save gates."]

26


















Bali Offsite 2

-R Lighting-

John Airey, Avi Bleiweiss, David Blythe, Bob Drebin, Erik Lindholm, Mark Peercy, David Tannenbaum, Amy Migdel, David Weng

Functionality

- Lighting
- Texture Env
- Alpha and Mask Logic

Features

- full speed (143 MPix/sec)
- per-pixel lighting (Blinn-Phong)
- environment (cube) mapping
- tangent space bump mapping
- texture as material color or specular sh

Features, cont.

- two-sided lighting
- distance attenuation
- normal vector bypass
- spot light attenuation
- two texture environments
- dithered sample mask
- altitude based fog
- range based fog
- alpha function
- alpha-to-one
- dithered alpha-to-mask conversion

Fog Calculation

[Fog equations are garbled in the source. Legible terms: a range term computed from the fragment position, and an altitude difference between Altitude_eye and the fragment altitude.]

Fog values are determined by indexing LUT at altitude of eye and altitude of fragment.

Precision

[Precision table illegible in source.]

Texture Environment

Cr = A \* Cf + B \* Ct + D

Implements lighting equation


32

## LTF Block Diagram



## **Lighter Block Diagram**



ic = interpolated color; t0/t1 = textures; \*\_f = front material color; \*\_b = back material color


34

## **Texture Environment Block Diagram**



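Ct: texture color; At: texture alpha; Cf: lit fragment color; Cc: constant color; Cb: bias color

| Mode             | A   | B  | D  |
|------------------|-----|----|----|
| Replace Fragment | [illegible] | [illegible] | [illegible] |
| Replace Texture  | [illegible] | [illegible] | [illegible] |
| Modulate         | Ct  | 0  | 0  |
| Blend            | -Ct | Cc | Cf |
| Decal            | -At | At | Cf |
| Add              | 1   | Cc | Cb |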

## Fog Block Diagram



## Alpha and Mask Block Diagram



## **FRU Functions**

- Stencil, depth, blend, logic-op non-multisampled pixels
- Supported fb formats: S10E5, RGBA12, LA16, C116
- Accumulation buffer
- Fast clear
- Pass fragment info to M chips, receive resolved color from M
- Automatic & explicit Z cull
- In-place copy & blend
- Calligraphics
- Partial resolve for multipass shading

# **FRU Performance**

- One depth/stencil tested, blended pixel per clock
- One accum buf operation every 2 clocks
- Fragment packet from ltf = 180 bits
- Desired 1 pixel/clock performance = 143 MPixels/sec
- Pixel = (RGBA, SZ) = 16\*4 + 32 = 96 bits
- RW of a pixel = 192 bits/clock = 3400MB/sec peak
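
The peak-bandwidth figure above follows from the pixel size and the 143 MHz clock. The short check below reproduces the arithmetic; the function names are illustrative, and "MB" is taken to mean 10^6 bytes, which is an assumption.

```python
CLOCK_HZ = 143e6          # 143 MHz clock, one pixel per clock
COLOR_BITS = 16 * 4       # RGBA at 16 bits per component
SZ_BITS = 32              # stencil + depth


def pixel_bits() -> int:
    """Bits in one (RGBA, SZ) pixel."""
    return COLOR_BITS + SZ_BITS


def read_write_bits_per_clock() -> int:
    """A read-modify-write touches the pixel twice per clock."""
    return 2 * pixel_bits()


def peak_mb_per_sec() -> float:
    """Peak framebuffer traffic in MB/sec (1 MB = 10**6 bytes, an assumption)."""
    return read_write_bits_per_clock() / 8 * CLOCK_HZ / 1e6


if __name__ == "__main__":
    print(pixel_bits(), read_write_bits_per_clock(), peak_mb_per_sec())
```

The computed value is 3432 MB/sec, which the slide rounds to 3400 MB/sec.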

FRU sdram Memory Map







## **FRU Issues**

- Better solution for pixels-in-flight problem?
- FRU TRB data
- Final supported pixel formats/conversion
- Alternative implementations
- Additional blend functions?
- Tiling & load balancing

-R SDRAM Interface-

Vital Statistics

- 8 32-bit wide SDRAMs
- BW = 143 MHz \* 32 bytes \* 80% = 3660 MB/sec
- Memory size per R = 16MB, 32MB, 128MB
- Bandwidth budget entries (labels partly illegible; ref: doc/chips/r/bw.txt):
  - 1816MB/sec in/out
  - 360MB/sec in
  - 360MB/sec out
  - 70MB/sec out
  - 1054MB/sec out
  - (1280x1024x72Hzx6bytes, 8R)



-R SDRAM Interface-

Design Goal

- Display request has the highest priority
- Round-robin arbitration with flow control
- 256-bit atom access allows zero overhead bank
- Command queue based interface allows fire-and-forget protocol
- Dedicated queues for texture and framebuffer acc
- Separate queues for SDRAM bank A and B to hide page miss latency
- Display request can lock the current SDRAM to allow random access within the same page


-R SDRAM Interface-

Issues

- Bus bandwidth
- Support 4-bank SDRAM?
- Max display response latency
  - G FIFO write A+32
  - G FIFO read A/32
  - display processor A/32
  - texture prefetch A/32\*8
  - framebuffer (A+32)\*8/32\*8
- Performance optimized arbitration
- Burst size = 2 to allow issuing new page activate command in the middle of transfer?

-R Network Interface-

Vital Statistics

- Three 16 bit @ 143MHz\*3 input and output channels
- BW = 143MHz\*3\*2bytes\*3 ports\*75% = 1930MB/sec

R chip input budget

- assume 15% tex req overhead
- -> 1170 MB/sec for texture data (responses)

R chip output budget

- 48MB/sec out video responses (1280x1024x72Hzx6bytes, 8R)
- 1561MB/sec out texture req & responses
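
The two "Vital Statistics" bandwidth figures in this part of the exhibit can be checked with the same kind of arithmetic. The sketch below reads the SDRAM clock as 143 MHz (the scan is garbled at that point) and treats "MB" as 10^6 bytes; both readings are assumptions, and the function names are illustrative.

```python
CLOCK_HZ = 143e6  # assumption: both interfaces are quoted against the 143 MHz clock


def network_mb_per_sec(clock_multiplier: int = 3, bytes_per_transfer: int = 2,
                       ports: int = 3, efficiency: float = 0.75) -> float:
    """R network interface: 143 MHz * 3 * 2 bytes * 3 ports * 75%."""
    return CLOCK_HZ * clock_multiplier * bytes_per_transfer * ports * efficiency / 1e6


def sdram_mb_per_sec(bytes_per_clock: int = 32, efficiency: float = 0.80) -> float:
    """R SDRAM interface: 143 MHz * 32 bytes (eight 32-bit SDRAMs) * 80%."""
    return CLOCK_HZ * bytes_per_clock * efficiency / 1e6


if __name__ == "__main__":
    print(network_mb_per_sec(), sdram_mb_per_sec())
```

The results, 1930.5 MB/sec and about 3660.8 MB/sec, agree with the 1930 MB/sec and 3660 MB/sec figures quoted on the slides.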

-R Network Interface-

Design Goal

- Match or provide higher BW than Bali net
- Half tile response per (R) cycle - we can achieve ~0.7 tile per cycle
- Need to re-sort packets from a specific G
- Throttle-back message for G->R flow control
- Individual flow control for round robin among channels

N interface - rcv

- rcv texture response packet
- display request packet
- G FIFO write packet

[Block diagram labels dated 11/11/96 are illegible in the source.]


N interface - xmt

# R Block Diagram - SDRAM centric





RAM: 256 x 32 bits for FIFO

FFLOPS:

- 64 current\_request
- 9 fifo rd pointer
- 9 fifo wr pointer
- 1 state bit for FSM
- 1 request\_valid
- 1 rsp\_busy

Clock count:

- 1 to get to front of FIFO,
- 1 to get out of FIFO into Request FSM
- 2 in the FSM.


48







## Scalar core

Overview

- Scalar core
- Scalar core programming environment
- Pixel path
- Pixel operation performance
  - Setup
  - Throughput
- Open issues

We are assuming

- RISC instruction set
- 200 MHz performance
- fast immediates & bitwise ops
- Floating point support
- C compiler
- Simulator w/ interactive debugger

Fast interface to G & vector units

- Register writes
  - Immediate to G chip: 1 cycle
  - Immediate + mask to PE: 2 cycles
- G chip register reads
- Round-trips / semaphore checks
- Low entry-point overhead

What if the core is external?

# Pixel Peak: Useful stuff

(Assuming DUPLO interface, 8 R's) RGBA12 copy 5x5 convolve 11x11 convolve

Pixel Peak: Simple stuff

325 MPix/s 283 MPix/s 1000 MPix/s

RGBA8 DrawPixels RGBA8 ReadPixels RGBA12 copy Not in-place

In-place

536 MPix/s

350 MPix/s 121 MPix/s


## Pixel Setup:

Input/output types SDRAM addr Write from host:

~250 cycles

Dead bytes for scan lines

Input/output type

Converter:

Command table:

~80 cycles

~20 cycles

IDMA Engine: Address, width, skip

GE

GIR

Pixel Setup:

Write from host Debubbler: ~100 cycles ~150 cycles 150 cycles

~80 cycles

33

Source type, dest type

ODMA Engine:

SDRAM addr, x, y, width, height

Fetch unit: Convertor:

Read to host Total:

x, y, width Reload from scalar: Copy & Read ops: Nothing?

~150 cycles

0 cycles?

Pixel Setup: LTF & IMP

8 registers 8 registers TF Pre-LUT scale/bias Configure Load from SDRAM LTF LUT/histogram

8 PE shadow reg writes Post-LUT scale/bias IMP Convertor LTF

8 registers ~16 cycles 4 registers 3 registers 1 register

IMP Blending Mode IMP Color Mask


Pixel Setup: R Chip to TFI

~8 registers ~8 registers incoming format Promoter

(convolution / color matrix) Texture filter mode TFI Filter mode

~4 registers

TFI Multiply/accum 121\*4 coefficients (max) 242 words 4\*4 coefficients (matrix) 8 words

Compute ??? Should happen on mode change

Other lazy validation scheme? Leave PE shadow in geometry state? 2 copies of shadow on PE?

51 registers R Chip Total

Excluding convolve coefficients
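
The TFI multiply/accumulate word counts on this slide are consistent with two coefficients packed per word: 121\*4 = 484 coefficients in 242 words, and 4\*4 = 16 coefficients in 8 words. The slide does not state the packing, so the `coeffs_per_word=2` default below is an inference from those ratios, and the function name is hypothetical. A minimal sketch of the arithmetic:

```python
def coefficient_words(taps, channels, coeffs_per_word=2):
    """Words needed to hold taps * channels coefficients.

    coeffs_per_word=2 is an assumption inferred from the slide's
    484 -> 242 and 16 -> 8 ratios; it is not stated in the source.
    """
    total = taps * channels
    # Round up so a partially filled last word is still counted.
    return (total + coeffs_per_word - 1) // coeffs_per_word


print(coefficient_words(121, 4))  # convolution case: 121*4 coefficients (max)
print(coefficient_words(4, 4))    # color matrix case: 4*4 coefficients
```

Under that assumption the two calls reproduce the 242-word and 8-word figures quoted above.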

Pixel Setup:

Total:

If written from scalar:

Writes alone:

Get from EMEM:

~100 cycles

Polymode etc. Restore R state

16 cycles?

16 cycles

is this necessary every time?

Pixel context switching expensive 4

## Pixel Setup: Useful draws

Pixel Setup: Useful copies

CopyPixels w/ LUT & scale/bias

800 cycles ??? cycles 1.5K cycles 700 cycles DrawPixels (no transfer modes) G chip setup:

DrawPixels (convolve) R chip setup: State restore: Ops/sec: Total:

133 K

R chip (download): R chip (draw): Draw rectangle: State restore: G chip: Total:

ops/sec:

700 cycles 800 cycles 800 cycles 200 cycles ??? cycles 2.5K cycles 80 K

CopyPixels

CopyP 200 cycles 800 cycles 200 cycles 1.2K cycles 1800 cycles 400 cycles 2.2K cycles 90 K Configure R chip (twice): Draw two rectangles: Copy to scratch: Configure R chip: Draw rectangle:


Ops/sec:
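
The Ops/sec figures quoted with the cycle totals above (1.5K cycles and 133 K, 2.5K cycles and 80 K, 2.2K cycles and 90 K) match what results from dividing a 200 MHz clock by the total setup cycles. The slides do not state this formula; the sketch below assumes it, taking the 200 MHz figure from the scalar core assumptions earlier in the exhibit. The names `CLOCK_HZ` and `ops_per_sec` are hypothetical.

```python
# Assumption: setup-bound rate = clock frequency / cycles per operation.
# The 200 MHz value is the scalar core performance assumed on the slides;
# the slides do not say the Ops/sec figures were derived this way.
CLOCK_HZ = 200_000_000


def ops_per_sec(total_cycles, clock_hz=CLOCK_HZ):
    """Operations per second when each operation costs total_cycles of setup."""
    return clock_hz / total_cycles


# Cycle totals from the slides, paired with the Ops/sec printed beside them.
quoted = [(1500, "133 K"), (2500, "80 K"), (2200, "90 K")]
for cycles, slide_value in quoted:
    rate_k = ops_per_sec(cycles) / 1000
    print(f"{cycles} cycles -> {rate_k:.1f} K ops/sec (slide: {slide_value})")
```

The computed rates are 133.3 K, 80.0 K and 90.9 K, so the slide values appear to be truncated rather than rounded.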

# **Biggest Unresolved Issues**

Scalar core

- Are our expectations realistic?
- What if it's external?
- Code space / cache requirements

Pixel Path

More specifics on setup
Roffware design, eyeke ovf5
Strip control eyeke ovf5
Hydre ster management
Mode change spreas
What can we simplify?
Pred textures
Multiply accumulate

Soft-ware Development Environment

O Gothware daubter for mode destigation

O Vector code baselate, tasselate back and

O Solate and vector code separated, shreat teaders

Geometry Path (cnt'd)

o waxis me

- better surface to verse thegratio

- tro-pins lightly supert

- to correct paths which

- Position sents operation

- beats buffered dress issue




Input Formats:
16-bit per channel (1-) channels per component) 616/343/243 Mpix.
16-bit per channel 16-bit (for conversion to 907)

Design Issues (cont.)


12-bit per channel (2 or 4 channels per component) 410/242 Mpix Used pred 12-bit Signed 12-bit Argon Format Projectors (1590x2048) require pixel rates of 500HB Transfer DF of 405Mpix/sec

Calligraphics:
Push-mode response packet
D thinks its in wapite mode
Just pass the data to DVO

Color Map 1 47 Map 4 255 Maps 1 16 Map (PUP)

After digital video in one port to be passed to the other DAC port.

Overlay Compositing

- One channel is heaplare information
- Other is overlay with alpha
- Compositing happens in the magnify unit

DAC Frequency 2508ts per channel 5008ts by wading two ploads per channel

alincater Asist - Alice for eccesing petches of R-chips.

Who's doing what:

- Dan McLachlan: Manager
- Dave Naegle: Line Buffers/Magnify/Gamma/Cursor/Clocks/Hyperpipe
- Mystery Person #3: Chesa Interface/DID/roster/global
- Ed Matchins: Verification & VOF Specification
- Mystery Person #2: Simulation & Board Design
- Mystery Person #1: VOP/Genlock Design 4 PD
- Andrew Bowen: Input/CKAPs/Minify
- Ed's mystery helper: Helping Ed

27.48 of total - Inc. Minify The gree to totally driven by the atten-In term of sec. \$500 137m2 20. 23m2 50. 25m2 - 95m3 100 of RAW is topat buffers. total\_ram 1517.1K bite 0.24 of total 31.48 of total 2.94 of total 2.28 of total 0.00 of total 16.59 of total 0.28 of total 0.10 of total 0.3t of total 1.88 of total 0.00k bitte 0.00k bite 517.10k bits 62.00k blte 16.00k bite 0.00k bits 216.00k bdts 0.00k bite 0.00k bits 0.00k bite 0.00k bits ures Datinate: 17.63 am x 17.63 am total 18 x 6 \* 108 48 pect x 4 \* 193 20 lvtl x 2 \* 46 6 6 0/90/91 3 10.6% gates 5.6% gates 21.5E getes 9.2E getee Video Receive: 227.4% getes 10.5K gates 126.2E getes 25.2K getos 21.2K getes 13.2K getes 201.2K geter cotal\_gates: 631K gates DAC Date DAC Centrol DAC Centrol 210 Control Line Safferer Operation Nax Clobal Logic YUF CONVERT Pla Counter Canhe 700te Megality

| A mend L (one top required);<br>(0x502) 2                           | 222222                       | ****** | 222222222 | RGB16<br>RGB16<br>RGB16<br>RGB16<br>RGB16<br>RGB16<br>RGB16<br>RGB16<br>RGB16<br>RGB16 (NO Stereo) |
|---------------------------------------------------------------------|------------------------------|--------|-----------|----------------------------------------------------------------------------------------------------|
| quired):                                                            |                              | *****  | 222222222 | B16<br>2816<br>8816*<br>8816<br>8816<br>8816<br>8816<br>1810<br>8810<br>8810<br>8810               |
| quired):                                                            |                              | *****  | 2222222   | 2816<br>2816<br>2816<br>2816<br>2816<br>2816<br>2816<br>2816                                       |
| quired):                                                            |                              | *****  | 288888    | 316<br>316<br>316<br>316<br>316<br>316<br>316<br>316<br>316                                        |
| - dufred):                                                          |                              | ****   | 22222     | 3816<br>3816<br>3816<br>1116<br>10 stereo                                                          |
| quired):                                                            |                              | ***    | 8885      | 1816<br>1816<br>116<br>16 (no stereo)                                                              |
| quired):                                                            |                              | 777    | 882       | 1816<br>116<br>16 (no stereo)                                                                      |
| 180<br>180<br>required):                                            |                              | 77     | 82        | 116<br>16 (no stereo                                                                               |
| quired):                                                            |                              | ×      | 2         | (e (no stereo)                                                                                     |
| required):                                                          |                              |        |           |                                                                                                    |
|                                                                     |                              | 7      | 8         | 116 + B16                                                                                          |
|                                                                     | 2/4                          | 7      | 2         | NC16 + 816                                                                                         |
| 390XX048_/4 × 416                                                   |                              | ×      | 8         | 112 + 816                                                                                          |
| Pixel Packet Benchidther                                            |                              |        |           |                                                                                                    |
| NGBALS 237 N<br>NGBALS 237 N<br>NGL6/BAL6 2343 N<br>NGL2/BALS 441 N | %0/s<br>%0/s<br>%0/a<br>%0/a |        |           |                                                                                                    |














Bali Configurations

| | Balito | Bell | Balon | Balonster |
|---|---|---|---|---|
| Cost | 8000 | 800-1-00 | 100-25083 | 200-100060 |
| Packaging | pergraph of the person of the | Mediting<br>Designate<br>Ball Box<br>Nede Box | Nack<br>Ball Box<br>+ Node Berra | Multi-Pack<br>Bell Borce<br>*<br>Node Bases |
| Power | 41000H<br>110V/16A | 2000W | >2007/20A | >2000W<br>Deficated |
| Graphics | Only TABY<br>IG/BR/1D/2Ch | Up To TMAX<br>BC/SGR/BD | Up The TAUST<br>80/84R/80 | Wale TAXX<br>80/648/60 |
| Processors | 7 | 1 | 2 | 8-4096 |
| Duplo<br>Connection | hoternal<br>hardwired | In-Pedrage<br>Cable | In-Reck<br>Cabbs | Party<br>Multi-Rack<br>Cobbs |
| Q | Base<br>PCI expusitio | Extended Base<br>PG expansion<br>External Xiox | Extended Base<br>KCI expansion<br>External Xibor | Extended Base<br>PCI expansion<br>External Xbox |
| Disks | CD<br>+2.9.5*<br>+cafemal | CD<br>+5 tytomal<br>+extlornal | 45 Internal<br>eestlemal | CD<br>+ Strikmad<br>+ externad |
| Sales<br>(01,656,000,000) | 15k/yr | 3-58(); | 250,51 | 10/yr |
| 11/1                      |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   
                                                                            |                                                |                                                  | Line II Nov 98                                  |
|                           |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   
                                                                            |                                                |                                                  |                                                 |



SGI Confidential










































Support for 16 samples per pixel at reduced speed



small = med from prev slides.


Note: This is one of several possible tilings.

Distribution of M's and DRAM banks within an 8x8 screen tile